One of the cool things about building web applications is the ability to either upload or download files from the web app.
In this tutorial we will explore Streamlit's file upload feature. With this feature you can upload files of any type and use them in your app. To work with file uploads you use the st.file_uploader() function.
Let us see how the st.file_uploader() function works.
First of all, let us explore its main features.
Ability to specify the file types you want to allow (type=[]): This feature is quite useful because it gives you a basic safeguard out of the box with little code. File types you have not listed are rejected when the user tries to upload them.
Ability to receive multiple files (accept_multiple_files=True): With this feature the user can select several files and upload them together, and the widget returns a list of uploaded files instead of a single one.
Ability to change the upload size limit: By default the maximum size of an uploaded file is 200 MB, but Streamlit allows you to change the limit with the server.maxUploadSize configuration option.
These are the most common features out of the box, but there are some other nice features that need explanation.
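Streamlit applies the type filter and the size limit for you. The plain-Python sketch below only illustrates the kind of check involved, so it is not how Streamlit implements it. The names is_allowed and MAX_UPLOAD_BYTES are hypothetical and not part of the Streamlit API.

```python
from pathlib import Path

# Assumption for illustration: mirror Streamlit's default 200 MB limit.
MAX_UPLOAD_BYTES = 200 * 1024 * 1024


def is_allowed(filename, size, allowed_types, max_bytes=MAX_UPLOAD_BYTES):
    """Return True if the file extension and size are both acceptable."""
    # Compare extensions case-insensitively and without the leading dot.
    extension = Path(filename).suffix.lower().lstrip(".")
    allowed = {t.lower().lstrip(".") for t in allowed_types}
    return extension in allowed and size <= max_bytes


print(is_allowed("photo.PNG", 1_024, ["png", "jpg"]))      # True
print(is_allowed("script.exe", 1_024, ["png", "jpg"]))     # False
print(is_allowed("huge.png", 300 * 1024 * 1024, ["png"]))  # False
```

In a real app the equivalent is a single call such as st.file_uploader("Upload Images", type=["png", "jpg"], accept_multiple_files=True), which returns a list you can loop over.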
Now let us see what file_uploader() gives back to your code.
The UploadedFile Class
Any file you upload is returned as an object of the UploadedFile class, which is a subclass of BytesIO. It is 'file-like', so when you work with it you treat it the way you would treat an open file. You therefore have to use the right file processing library or package to process or read your uploaded file.
For example, if it is an image you will have to use PIL, OpenCV, etc.
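Because the uploaded object behaves like BytesIO, the usual cursor rules apply. The sketch below uses io.BytesIO as a stand-in for an uploaded file (an assumption made so that it runs without Streamlit) and shows why a second read() returns nothing until you rewind.

```python
import io

# io.BytesIO stands in for the object returned by st.file_uploader().
uploaded_file = io.BytesIO(b"hello from an uploaded file")

first_read = uploaded_file.read()   # reads everything, cursor moves to the end
second_read = uploaded_file.read()  # nothing left to read

uploaded_file.seek(0)               # rewind to the start
after_rewind = uploaded_file.read()

# getvalue() returns the full contents regardless of the cursor position.
whole_buffer = uploaded_file.getvalue()

print(first_read)                  # b'hello from an uploaded file'
print(second_read)                 # b''
print(after_rewind == first_read)  # True
print(whole_buffer == first_read)  # True
```

If you need the contents more than once, read them into a variable a single time or call seek(0) before reading again.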
Filetypes and Their Respective Packages
Image (PNG, JPEG, JPG): PIL, OpenCV, scikit-image
CSV: pandas, datatable, csv
PDF: PyPDF2, pdfplumber, etc.
Docx: docx2txt, python-docx, textract
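One way to organise this mapping in code is a small registry that picks a reader from the file's MIME type. The sketch below is only an illustration under stated assumptions: it uses the standard library alone (so only text and CSV are covered), io.BytesIO stands in for the uploaded file, and READERS, read_text, read_csv_rows and read_upload are hypothetical names. The MIME type a browser reports for a given file may also differ from the ones used here.

```python
import csv
import io


def read_text(file_obj):
    """Decode a plain-text upload as UTF-8."""
    return file_obj.read().decode("utf-8")


def read_csv_rows(file_obj):
    """Parse a CSV upload into a list of dicts with the csv module."""
    text = io.StringIO(file_obj.read().decode("utf-8"))
    return list(csv.DictReader(text))


# Hypothetical registry: MIME type -> reader function.
READERS = {
    "text/plain": read_text,
    "text/csv": read_csv_rows,
}


def read_upload(file_obj, mime_type):
    """Pick a reader from the MIME type, or raise for unsupported types."""
    try:
        reader = READERS[mime_type]
    except KeyError:
        raise ValueError(f"No reader registered for {mime_type!r}") from None
    return reader(file_obj)


csv_upload = io.BytesIO(b"name,age\nAda,36\nAlan,41\n")
rows = read_upload(csv_upload, "text/csv")
print(rows[0]["name"], rows[1]["age"])  # Ada 41
```

Readers for images, PDFs or Docx files would slot into the same registry using the packages listed above.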
Getting the Filename and Filesize
With the UploadedFile class you can get the original filename, file size and file type. Calling dir(uploaded_file) will show all the attributes and methods of this class.
To get the original filename, file size and file type you can use the code below:
    uploaded_file = st.file_uploader("Upload Files", type=['png', 'jpeg'])
    if uploaded_file is not None:
        file_details = {"FileName": uploaded_file.name, "FileType": uploaded_file.type, "FileSize": uploaded_file.size}
        st.write(file_details)

This is the same information that shows just below the drag and drop section when you upload a file.
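The size attribute is a raw byte count, so you may want to format it before displaying it. The following is a minimal sketch, assuming a stand-in object with the same name, type and size attributes. FakeUpload, human_size and describe are hypothetical helpers, not part of Streamlit.

```python
from dataclasses import dataclass


@dataclass
class FakeUpload:
    """Hypothetical stand-in exposing the same three attributes as an upload."""
    name: str
    type: str
    size: int  # bytes


def human_size(num_bytes):
    """Format a byte count as B, KB, MB or GB (1 KB = 1024 B)."""
    size = float(num_bytes)
    for unit in ("B", "KB", "MB"):
        if size < 1024:
            return f"{size:.1f} {unit}"
        size /= 1024
    return f"{size:.1f} GB"


def describe(upload):
    """Collect the filename, MIME type and a readable size in one dict."""
    return {
        "FileName": upload.name,
        "FileType": upload.type,
        "FileSize": human_size(upload.size),
    }


details = describe(FakeUpload(name="cat.png", type="image/png", size=2_621_440))
print(details)
# {'FileName': 'cat.png', 'FileType': 'image/png', 'FileSize': '2.5 MB'}
```

In the app you would pass the real uploaded file to describe() and hand the result to st.write().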
Reading Each File Type
To make it easy, I have built a simple file upload app and you can check the code below.
    import streamlit as st

    # File Processing Pkgs
    import pandas as pd
    import docx2txt
    from PIL import Image
    from PyPDF2 import PdfReader  # older PyPDF2 releases call this PdfFileReader
    import pdfplumber


    def read_pdf(file):
        # Extract the text of every page with PyPDF2
        pdf_reader = PdfReader(file)
        all_page_text = ""
        for page in pdf_reader.pages:
            all_page_text += page.extract_text() or ""
        return all_page_text


    def read_pdf_with_pdfplumber(file):
        # Extract the text of the first page only
        with pdfplumber.open(file) as pdf:
            page = pdf.pages[0]
            return page.extract_text()


    # import fitz  # this is pymupdf
    # def read_pdf_with_fitz(file):
    #     with fitz.open(file) as doc:
    #         text = ""
    #         for page in doc:
    #             text += page.getText()
    #         return text


    # Fxn
    # The original @st.cache decorator is dropped here because st.cache is
    # deprecated in current Streamlit releases.
    def load_image(image_file):
        img = Image.open(image_file)
        return img


    def main():
        st.title("File Upload Tutorial")

        menu = ["Home", "Dataset", "DocumentFiles", "About"]
        choice = st.sidebar.selectbox("Menu", menu)

        if choice == "Home":
            st.subheader("Home")
            image_file = st.file_uploader("Upload Image", type=['png', 'jpeg', 'jpg'])
            if image_file is not None:
                # To See Details
                # st.write(type(image_file))
                # st.write(dir(image_file))
                file_details = {"Filename": image_file.name, "FileType": image_file.type, "FileSize": image_file.size}
                st.write(file_details)

                img = load_image(image_file)
                st.image(img, width=250)  # st.image takes a width but no height argument

        elif choice == "Dataset":
            st.subheader("Dataset")
            data_file = st.file_uploader("Upload CSV", type=['csv'])
            if st.button("Process"):
                if data_file is not None:
                    file_details = {"Filename": data_file.name, "FileType": data_file.type, "FileSize": data_file.size}
                    st.write(file_details)

                    df = pd.read_csv(data_file)
                    st.dataframe(df)

        elif choice == "DocumentFiles":
            st.subheader("DocumentFiles")
            docx_file = st.file_uploader("Upload File", type=['txt', 'docx', 'pdf'])
            if st.button("Process"):
                if docx_file is not None:
                    file_details = {"Filename": docx_file.name, "FileType": docx_file.type, "FileSize": docx_file.size}
                    st.write(file_details)

                    # Check File Type
                    if docx_file.type == "text/plain":
                        # raw_text = docx_file.read()  # read as bytes
                        # st.write(raw_text)
                        # st.text(raw_text)  # fails
                        # Read only once: a second docx_file.read() returns empty bytes
                        raw_text = str(docx_file.read(), "utf-8")  # works with st.text and st.write, used for further processing
                        # st.text(raw_text)  # works
                        st.write(raw_text)  # works

                    elif docx_file.type == "application/pdf":
                        # raw_text = read_pdf(docx_file)
                        # st.write(raw_text)
                        try:
                            with pdfplumber.open(docx_file) as pdf:
                                page = pdf.pages[0]
                                st.write(page.extract_text())
                        except Exception:
                            st.write("None")

                    elif docx_file.type == "application/vnd.openxmlformats-officedocument.wordprocessingml.document":
                        # Use the right file processor (Docx, Docx2Text, etc.)
                        raw_text = docx2txt.process(docx_file)  # Pass in the UploadedFile object directly
                        st.write(raw_text)

        else:
            st.subheader("About")
            st.info("Built with Streamlit")
            st.info("Jesus Saves @JCharisTech")
            st.text("Jesse E.Agbe(JCharis)")


    if __name__ == '__main__':
        main()

You can also check out the video tutorial below.
Thanks For Your Time
Jesus Saves
By Jesse E.Agbe(JCharis)